\section{Our technique}

\subsection{General description}
\label{sec-our-technique-general-description}

Our technique consists of applying \emph{local graph rewriting} to the
graph of instructions in intermediate code.  Local graph rewriting has
the advantage of being simple, both to implement and when it comes to
proving correctness.

For the purpose of this paper, we assume that some initial phase has
determined that the following conditions are respected:

\begin{enumerate}
\item there are two instructions, $D$ and $I$, in the program that
  are identical tests,
\item the variable being tested is the same in $D$ and $I$,
\item $D$ dominates $I$, and
\item the variable being tested is not assigned to in any path from
  $D$ to $I$.
\end{enumerate}

In a real compiler, such a phase probably does not exist.  Some
conditions are easier to verify if the compiler translates the
intermediate code to SSA form \cite{Cytron:1989:EMC:75277.75280,
  Cytron:1991:ECS:115372.115320}, and some conditions can be verified
during the execution of our technique, avoiding the need to include
them in a separate phase.  The last condition may require an analysis
for detecting variables that are assigned to in nested functions.
Such a phase is present in most optimizing compilers in order to
determine where to allocate space for such variables.

In \refFig{fig-example-naive}, the two test instructions labeled $D$
and $I$ respectively verify the conditions listed above, and we will
use these two instructions to illustrate our technique.

During the execution of our algorithm, the instruction $I$ will be
\emph{replicated}, so that it is part of some set $S$ in which every
replica remains dominated by $D$.  Initially, $S$ contains $I$ as its
only element.

In our technique, we keep track of the outcome of the test in the
\emph{control arcs} of the control-flow graph.  We can think of this
information as being represented as \emph{labels} associated with
control arcs:

\begin{itemize}
\item An arc is unlabeled if we have no information concerning the
  outcome of the test at that point in the program.
\item An arc is labeled \emph{true} if the outcome of the test at that
  point in the program is known to be true.
\item An arc is labeled \emph{false} if the outcome of the test at
  that point in the program is known to be false.
\end{itemize}

Initially, only the outgoing arcs of $D$ and $I$ have a label.

Our technique involves the repeated application of the first
applicable rewrite rule in the following list to some arbitrary
element of $S$, say $s$, that does not itself have an immediate
predecessor in the control-flow graph that is also an element of $S$:

\begin{enumerate}
\item If $s$ has no predecessors, then remove it from $S$.
\item If $s$ has an incoming arc labeled \emph{true}, then change the
  head of that arc so that it refers to the successor of $s$ referred
  to by the outgoing arc of $s$ labeled \emph{true}.
\item If $s$ has an incoming arc labeled \emph{false}, then change the
  head of that arc so that it refers to the successor of $s$ referred
  to by the outgoing arc of $s$ labeled \emph{false}.
\item If $s$ has $n>1$ predecessors, then replicate $s$ $n$ times;
  once for each predecessor.  Every replica is inserted into $S$.
  Labels of outgoing control arcs are preserved in the replicas.
\item Let $p$ be the (unique) predecessor of $s$.  Remove $p$ as a
  predecessor of $s$ so that existing immediate predecessors of $p$
  instead become immediate predecessors of $s$.  Insert a replica of
  $p$ in each outgoing control arc of $s$, preserving the label of
  each arc.
\end{enumerate}

\noindent
Rewrite rules are applied until the set $S$ is empty, or until each
element of $S$ has an immediate predecessor in the control-flow graph
that is also a member of $S$.  An element of $S$ could have an
immediate predecessor like that if the dominated instruction $I$ were
part of a loop.  We need to exclude such elements, or else our
technique might not terminate in all cases.

\subsection{A simple example}
\label{sec-our-technique-example}

Let us see how our technique works on the example in
\refFig{fig-example-naive}.  The initial situation is shown in
\refFig{fig-rewrite-1}.  The instructions that are members of $S$ are
drawn with a slightly thicker box.

\begin{figure}
\begin{center}
\inputfig{fig-rewrite-1.pdf_t}
\end{center}
\caption{\label{fig-rewrite-1}
Initial instruction graph.}
\end{figure}

As \refFig{fig-rewrite-1} shows, the second \texttt{consp} is
dominated by the first, so it becomes the only member of the set $S$.
The last rewrite rule applies to the second \texttt{consp} so that the
\texttt{setq} is replicated as its successors.  The result of this
first rewrite is shown in \refFig{fig-rewrite-one-and-a-half}.

\begin{figure}
\begin{center}
\inputfig{fig-rewrite-one-and-a-half.pdf_t}
\end{center}
\caption{\label{fig-rewrite-one-and-a-half}
Result after one rewrite.}
\end{figure}

As we can see in \refFig{fig-rewrite-one-and-a-half}, the last rewrite
rule applies again resulting in the replication of the \texttt{call}.
The result after the second rewrite is shown in
\refFig{fig-rewrite-2}.

\begin{figure}
\begin{center}
\inputfig{fig-rewrite-2.pdf_t}
\end{center}
\caption{\label{fig-rewrite-2}
Result after two rewrites.}
\end{figure}

As we can see in \refFig{fig-rewrite-2}, the second \texttt{consp}
now has two predecessors, and both incoming arcs are unlabeled.
Therefore, rewrite rule number $4$ applies and the \texttt{consp} is
replicated.  As a result, $S$ now has two members.  The result of
applying this rule is shown in \refFig{fig-rewrite-3}.

\begin{figure}
\begin{center}
\inputfig{fig-rewrite-3.pdf_t}
\end{center}
\caption{\label{fig-rewrite-3}
Result after replicating the test.}
\end{figure}

We now choose the leftmost replica of the second \texttt{consp} to
apply our rules to.  It has a single predecessor with an unlabeled
incoming control arc, so the last rewrite rule applies.  We replicate
the \texttt{setq} in both branches of the test, giving us the result
shown in \refFig{fig-rewrite-4}.

\begin{figure}
\begin{center}
\inputfig{fig-rewrite-4.pdf_t}
\end{center}
\caption{\label{fig-rewrite-4}
Result after replicating \texttt{setq}.}
\end{figure}

In \refFig{fig-rewrite-4}, the last rewrite rule applies again, and we
replicate the \texttt{cons-car}, giving us the situation shown in in
\refFig{fig-rewrite-5}.

\begin{figure}
\begin{center}
\inputfig{fig-rewrite-5.pdf_t}
\end{center}
\caption{\label{fig-rewrite-5}
Result after replicating \texttt{cons-car}.}
\end{figure}

As \refFig{fig-rewrite-5} shows, the \texttt{consp} instruction now
has a single predecessor, but the incoming arc has a known outcome of
the test, namely \textit{true}.  Therefore, rewrite rule number $2$
applies.  The left outgoing arc of the first \texttt{consp} is
redirected to go directly to the \texttt{cons-car} instruction.  The
result of applying this rule is shown in \refFig{fig-rewrite-6}.

\begin{figure}
\begin{center}
\inputfig{fig-rewrite-6.pdf_t}
\end{center}
\caption{\label{fig-rewrite-6}
Result after short-circuit \texttt{consp}.}
\end{figure}

At this point, the \texttt{consp} that we have been processing has no
predecessor.  Therefore we apply rule number $1$ and remove it from
$S$.  Removing all instructions that can not be reached from the start
instruction gives the situation shown in \refFig{fig-rewrite-7}.

\begin{figure}
\begin{center}
\inputfig{fig-rewrite-7.pdf_t}
\end{center}
\caption{\label{fig-rewrite-7}
Result after removing unreachable instructions.}
\end{figure}

Analyzing \refFig{fig-rewrite-7}, we can see that if the result of the
first \texttt{consp} yields \emph{true}, then no second test is
performed.  Instead, the variable \texttt{a} is set to the result of
the instruction \texttt{cons-car}, the variable \texttt{b} is set to
the result of the call, and the variable \texttt{c} is set to the
result of the instruction \texttt{cons-cdr}.  Applying the same rules
to the remaining \texttt{consp} instruction in $S$ and then to the
second \texttt{null} instruction (which is now dominated by the
first), yields the final result shown in  \refFig{fig-rewrite-8}.

\begin{figure}
\begin{center}
\inputfig{fig-rewrite-8.pdf_t}
\end{center}
\caption{\label{fig-rewrite-8}
Final result.}
\end{figure}

This example represents a control graph that is particularly simple,
in that there are no loops between the first and the second
\texttt{consp} instructions.  Our technique must obviously work no
matter the complexity of the control graph, as long as the first test
dominates the second.

\subsection{Proof of correctness and termination}

The correctness of our technique is easy to prove, simply because each
rewrite rule preserves the semantics of the program.  The last rewrite
rule preserves the semantics only under certain circumstances which
are easy to verify:

\begin{itemize}
\item The predecessor does not assign to a lexical variable that is
  read by the test instruction.  This condition is respected because
  we have assumed that the variable being tested is not assigned to
  in any path between the first and the second occurrences of the
  test, as condition number $4$ in
  \refSec{sec-our-technique-general-description} requires.
\item The predecessor must not have any other side effect that may
  alter the outcome of the test.  By restricting the test to lexical
  variables, this restriction is also respected.
\end{itemize}

Termination is a bit harder to prove.  One way is to find some
non-negative \emph{metric} that can be shown to strictly decrease as a
result of the application of each rewrite rule.  We have not found any
such metric.  However, this conundrum can be avoided by a simple
\emph{grouping} of the rewrite rules.  This grouping is not required
to be present in the implementation of our technique, only in the
termination proof.

To see how the rewrite rules can be grouped, consider a general case
where the test instruction has some arbitrary number of labeled or
unlabeled incoming control arcs.  Rules number $2$ and $3$ are first
applied a finite number of times.  What happens next depends on the
number $n$ of unlabeled incoming control arcs:

\begin{itemize}
\item If $n=0$ the first rewrite
rule applies, in which case the instruction is removed from the set
$S$.
\item If $n=1$, the last rewrite rule is applied.  The crucial
  characteristic of this rewrite rule is that the total number of
  unlabeled control arcs decreases by one.
\item If $n>1$, rewrite rule number $4$ is applied.  Notice that the
  number of unlabeled control arcs is not modified by the application of
  this rule.
\end{itemize}

For the purpose of this proof, we assume that the individual rewrite
steps in a group happen immediately after each other, so that for a
particular instruction, the labeled incoming control arcs are first
eliminated, the same instruction is then potentially replicated, and
finally, the last rewrite rule is applied to one of the replicas.
However, the implementation does not have to work that way in order
for termination to be certain.

In other words, we can create groups of rewrite steps, where a group
can be formed according to one of the following \emph{group types}:

\renewcommand\theenumi {\Alph{enumi}}
\begin{enumerate}
\item A group in this type has a finite number of applications of
  rewrite rules number $2$ and $3$, followed by a single application
  of rewrite rule number $1$.
\item A group in this type has a finite number of applications of
  rewrite rules number $2$ and $3$, followed by a single application
  of rewrite rule number $5$.
\item A group in this type has a finite number of applications of
  rewrite rules number $2$ and $3$, followed by a single application
  of rewrite rule number $4$, followed by a single application of
  rewrite rule number $5$.
\end{enumerate}

With this information, we can create a metric consisting of a pair
$(U,N)$, where $U$ is the total number of unlabeled control arcs of
the program and $N$ is the number of elements of the set $S$.  Two
pairs can now be compared using a \emph{lexicographic order}, so that
for two pairs $(U_1,N_1)$ and $(U_2,N_2)$, $(U_1,N_1)$ is
\emph{strictly smaller than} $(U_2,N_2)$, written $(U_1,N_1) <
(U_2,N_2)$, if and only if either $U_1 < U_2$ or $U_1 = U_2$ and $N_1
< N_2$.

%Theorem: 
\begin{theorem}
The rewrite algorithm terminates.
\end{theorem}

\begin{proof}
As a result of a rewrite according to a group of type A, $U$
remains the same, but $N$ decreases by $1$.  As a result of a rewrite
according to a group of type $B$ or $C$, $U$ decreases by $1$ (but $N$
may increase).  Since $U$ and $N$ are both non-negative integers, we
must reach a normal form after a finite number of rewrites.
\end{proof}

The following table illustrates how the grouping technique applies to
our example:
\vskip 3mm
\begin{tabular}{|l|r|r|c|l|r|r|}
\hline
Initial & $U$ & $N$ & Group & Final & U & N\\
\hline
\refFig{fig-rewrite-1} & 11 & 2 & B &
\refFig{fig-rewrite-one-and-a-half} & 10 & 2\\
\refFig{fig-rewrite-one-and-a-half} & 10 & 2 & B &
\refFig{fig-rewrite-2} & 9 & 2\\
\refFig{fig-rewrite-2} & 9 & 2 & C & \refFig{fig-rewrite-4} & 8 & 3\\
\refFig{fig-rewrite-4} & 8 & 3 & B & \refFig{fig-rewrite-5} & 7 & 3\\
\refFig{fig-rewrite-5} & 7 & 3 & A & \refFig{fig-rewrite-7} & 7 & 2\\
\hline
\end{tabular}

\vskip 3mm
Each row in the table represents a group of rewrites.  For each
rewrite group, we give the figure representing the initial state with
the values of $U$ and $N$ for that figure, followed by the figure
representing the final state with the values of $U$ and $N$ after the
application of the rewrite steps in the group.
